| Node | In.Degree | Out.Degree |
|---|---|---|
| 1 | 3 | 0 |
| 2 | 3 | 1 |
| 3 | 1 | 2 |
| 4 | 0 | 2 |
| 5 | 0 | 2 |
October 31, 2025
- Flights
- Protein Interactions
- Citations
- CRAN
| Node | In.Degree | Out.Degree |
|---|---|---|
| 1 | 3 | 0 |
| 2 | 3 | 1 |
| 3 | 1 | 2 |
| 4 | 0 | 2 |
| 5 | 0 | 2 |
Full Evolution
- Often unavailable or too costly to extract
- Contains a lot of information \(~\)
Snapshot
- More often available
- Much easier to work with
- Can still study key properties e.g. degrees
There has been a heated debate over the presence of scale-free networks in reality.
Scale-free Degree Distribution — Orginal scale
Scale-free Degree Distribution — Log scale
Several definitions of scale-free in the literature
Heavier than exponential
\[ \bar F (k) \gg e^{-ck} \]
Power-law \[ \bar F(k) \sim k^{-\gamma} \]
Regularly varying tail \[ \bar F(k) = k^{-\gamma}\mathcal L (k) \]
\[ K\sim \text{Pareto}(\gamma) \] \[ ~ \] Estimate \(\gamma\)
\[ K|K>u \sim \text{GP}(\xi, \sigma) \] \[ ~ \] Estimate \((\xi,\sigma)\)
\[ K|K< u \sim \text{Pareto}(\gamma) \] \[ K|K>u \sim \text{GP}(\xi, \sigma) \] Estimate \((\gamma, u, \xi, \sigma)\)
Zipf-Polylog
\[ f(k) \propto k^{-\alpha} \theta^k, \qquad k=1, 2, \ldots, u \]
Integer Generalised Pareto
\[ F(k)= 1-\left(1+\frac{\xi(k-u)}{\sigma +\xi u}\right)_+^{-1/\xi}, \qquad k=u+1,\ldots \]
Using this model Lee, Eastoe, and Farrell (2024) found that networks often have a heavy tail but not quite as heavy as implied by the body.
Many use scale-freeness of a network to justify the mechanics behind the network’s growth.
Preferential Attachment (rich-get-richer) \(\Rightarrow\) Power-law degree distribution
but
Power-law degree distribution \(\nRightarrow\) Preferential Attachment (rich-get-richer)
This is fairly common as modelling the degrees does not generally inform the network growth.
Example
\[ \pi_i = \left. k_i \middle/ \sum_j k_j \right. \] where \(k_i\) is degree of node \(i\).
Results in a power-law degree distribution
\[ \bar F(k)\sim k^{-2} \]
Example
\[ \pi_i = \left. b(k_i) \middle/ \sum_j b(k_j) \right. \] where \(k_i\) is degree of node \(i\).
How do we study the degree distribution of this?
Consider the GPA model when \(m=1\), and a continuous time branching process \(\zeta(t)\) where:
\[ \Pr(\zeta(t+\text{d}t) = k+1 | \zeta(t) =k) = b(k) \text{d}t + o(\text{d}t) \]
This pure birth process corresponds to the growth of an individual nodes degree.
Construct the tree \(\Upsilon(t)\) determined by \(\zeta(t)\) such that:
Denote \(\Upsilon(t)_{\downarrow x}\) the tree when treating \(x\) as the root.
This construction is equivalent to the GPA model with \(m=1\) and preference function \(b(\cdot)\)
Rudas, Tóth, and Valkó (2007) states that for a given characteristc function \(\phi(\cdot)\)
\[ \lim_{t\rightarrow\infty} \frac{1}{\left|\Upsilon(t)\right|}\sum_{x\in\Upsilon(t)}\phi(\Upsilon(t)_{\downarrow x}) = \lambda^* \int_0^\infty e^{-\lambda^*}\mathbb E \left[\phi(\Upsilon(t))\right]\text{d}t \]
where \(\lambda^*\) satisfies
\[ \hat\rho(\lambda^*) = \int_0^\infty e^{-\lambda^*t}\rho(t)\text{d}t = 1 \]
and \(\rho(t)\) is the density of the point process associated with \(\zeta(t)\).
We can write the tail of the degree distribution at time \(t\) as:
\[ \frac{\sum_{x\in\Upsilon(t)} \mathbb I\left\{\text{deg}(x, \Upsilon(t)\downarrow x)>k \right\} }{\left|\Upsilon(t)\right|} \]
and so the tail of the limiting degree distribution as \(t\rightarrow \infty\) is:
\[ \bar F(k) = \lim_{t\rightarrow\infty} \frac{\sum_{x\in\Upsilon(t)} \mathbb I\left\{\text{deg}(x, \Upsilon(t)_{\downarrow x})>k \right\} }{\left|\Upsilon(t)\right|} \]
So we can now use the previous result to obtain:
\[ \bar F(k) = \lambda^* \int_0^\infty e^{-\lambda^*t}\mathbb E\left[\mathbb I\left\{\text{deg}(x, \Upsilon(t)_{\downarrow x})>k \right\}\right]\text{d}x \]
Through fairly simple calculations we get:
\[ \bar F(k) = \prod_{i=0}^k\frac{b(i)}{\lambda^*+b(i)} \]
where \(\lambda^*\) satisfies
\[ \hat \rho(\lambda^*) = \sum_{k=0}^\infty \prod_{i=0}^k\frac{b(i)}{\lambda^* + b(i)} = 1 \]
What effect does \(b(\cdot)\) have on this degree distribution, particularly in the tail?
We want to pay careful attention to the extreme values
Studies of empirical degrees often use methods from continuous extremes
As this is a discrete distribution we need some new machinery.
Frechet (heavy tailed)
\[ \bar F(x) = x^{-\gamma} \mathcal L(x) \] e.g. Pareto, Cauchy, Lévy
Gumbel (light-tailed)
\[ \lim_{x\rightarrow\infty}\frac{\bar F (x + t a(x))}{\bar F(x)}= e^{-t} \] e.g. Exponential, Normal, Gamma
All continuous distributions with infinite right endpoint fall into one of these.
Many discrete distributions don’t satisfy the conditions to belong to an MDA
Shimura (2012) uses the following quantity to categorise discrete distributions \[ \Omega(F, k) = \left(\log \frac{\bar F(k+1)}{ \bar F(k+2)}\right)^{-1} - \left(\log \frac{\bar F(k)}{ \bar F(k+1)}\right)^{-1} \]
Frechet
\[ \lim_{k\rightarrow\infty} \Omega(F,k) = 1/\alpha \] and \[ \lim_{k\rightarrow\infty} \frac{\bar F(k+1)}{\bar F(k)} = 1 \]
Gumbel
\[ \lim_{k\rightarrow\infty} \Omega(F,k) = 0 \] and \[ \lim_{k\rightarrow\infty} \frac{\bar F(k+1)}{\bar F(k)} = 1 \]
Recoverable to Gumbel
\[ \lim_{k\rightarrow\infty} \Omega(F,k) = 0 \] and \[ \lim_{k\rightarrow\infty} \frac{\bar F(k+1)}{\bar F(k)} \neq 1 \]
Now we can use the limiting survival function of a GPA model to show that:
\[ \lim_{k\rightarrow\infty} \Omega(F,k)= \begin{cases} \lim_{k\rightarrow \infty}\frac{b(k+1)-b(k)}{\lambda^*}, &\text{when } b(k) \rightarrow \infty\\ 0, &\text{otherwise} \end{cases} \]
and therefore for \(b(k)\rightarrow\infty\) as \(k\rightarrow\infty\)
- \(\bar F (k)\) is regularly varying (Frechet MDA) if and only if \(\lim_{k\rightarrow \infty}[b(k+1)-b(k)] = \nu >0\), in which case the tail index is \(\lambda^*/\nu\)
- \(\bar F(k)\) is light-tailed (Gumbel MDA) if and only if \(\lim_{k\rightarrow \infty}[b(k+1)-b(k)]=0\).
Note that if \(\lim_{k\rightarrow\infty} b(k)<\infty\), then \(\bar F(k)\) is recoverable to the Gumbel MDA.
Barabasi-Albert (BA) \[ b(k) = k+1 \]
\[ \lim_{k\rightarrow\infty} \Omega(F,k) = 1/\lambda^* = 1/2 \] \[ \lim_{k\rightarrow\infty} \frac{\bar F(k+1)}{\bar F(k)} = 1 \] Frechet, with index 2
Polynomial \[ b(k) = (k+1)^\alpha, \quad \alpha\in (0,1) \]
\[ \lim_{k\rightarrow\infty} \Omega(F,k) = 0 \] \[ \lim_{k\rightarrow\infty} \frac{\bar F(k+1)}{\bar F(k)} = 1 \] Gumbel
Uniform \[ b(k) = c, \quad c>0 \]
\[ \lim_{k\rightarrow\infty} \Omega(F,k) = 0 \] \[ \lim_{k\rightarrow\infty} \frac{\bar F(k+1)}{\bar F(k)} \neq 0 \] Recoverable to Gumbel
- \(\bar F (k)\) is regularly varying (Frechet MDA) if and only if \(\lim_{k\rightarrow \infty}[b(k+1)-b(k)] = \nu >0\), in which case the tail index is \(\lambda^*/\nu\)
- \(\bar F(k)\) is light-tailed (Gumbel MDA) if and only if \(\lim_{k\rightarrow \infty}[b(k+1)-b(k)]=0\).
We require something that is eventually linear but is flexible enough for real networks
Barabasi-Albert \[ b(k) = k+1 \]
Proposed Model \[ b(k) = \begin{cases} k^\alpha + \varepsilon, &k\le k_0\\ k_0^\alpha + \varepsilon + \beta(k-k_0), &k>k_0 \end{cases} \]
\[ \bar F(k) = \frac{2}{(k+2)(k+3)} \]
\[ \bar F(k) = \begin{cases} \prod_{i=0}^{k}\frac{i^\alpha + \varepsilon}{\lambda^*+i^\alpha + \varepsilon},&k\le k_0,\\ \left(\prod_{i=0}^{k_0-1}\frac{i^\alpha + \varepsilon}{\lambda^*+i^\alpha + \varepsilon}\right)\frac{\Gamma(\lambda^*+k_0^\alpha + \varepsilon)/\beta)}{\Gamma\left((k_0^\alpha + \varepsilon)/\beta\right)} \frac{\Gamma\left(k-k_0 + 1 +\frac{k_0^\alpha + \varepsilon}{\beta}\right)}{\Gamma\left(k-k_0 + 1 +\frac{\lambda^* +k_0^\alpha + \varepsilon}{\beta}\right)},&k > k_0, \end{cases} \]
All limiting degree distributions produced by this model are Frechet with tail-index \(\lambda^*/\beta\).
\[ \begin{aligned} L(\pmb n | \pmb \theta,l) = &\left(\frac{\lambda^*}{\lambda^*+\varepsilon}\right)^{n_0}\left(\prod_{j=l}^{k_0-1}\frac{j^\alpha +\varepsilon}{\lambda^* + j^\alpha +\varepsilon}\right)^{\left(\sum_{i\ge k_0}n_{i}\right)} \\ &\times \prod_{l \le i<k_0}\left(\frac{\lambda^*}{\lambda^* +i^\alpha + \varepsilon } \prod_{j=l}^{k_0-1}\frac{j^\alpha + \varepsilon}{\lambda^* + j^\alpha + \varepsilon}\right)^{n_i}\\ &\times \prod_{i\ge k_0}\left(\frac{\text{B}(i-k_0 + (k_0^\alpha + \varepsilon)/\beta,1+\lambda^*/\beta)}{\text{B}((k_0^\alpha + \varepsilon)/\beta,\lambda^*/\beta)}\right)^{n_i}, \end{aligned} \]
\[\begin{align*} \alpha&\sim \text{Gamma}(1,0.01),\\ \beta &\sim \text{Gamma}(1,0.01),\\ k_0 &\sim \text{U}(1,10,000),\\ \varepsilon &\sim \text{Gamma(1,0.01)}, \end{align*}\]
Now we use an adaptive Metropolis-Hastings to obtain posterior samples
Internet
Flights
Proteins
Data sourced from Network Data Repository(Rossi and Ahmed 2015)
We will fit the model and compare it to the mixture model.
Theory based on trees, which real networks rarely are
Only consider degrees in the snapshot
Assume a fairly restrictive model (pure birth)
Look beyond just the degrees
Incorporate more complex behaviour into the growth model
Extend theory beyond that of trees
Beyond the Descriptive Modelling of Degrees - Thomas Boughen